Abstract. We describe a testing technique that uses information computed by 
symbolic execution of a program unit to guide the generation of inputs to the 
system containing the unit, in such a way that the unit’s, and hence the sys- 
tem’s, coverage is increased. The symbolic execution computes unit constraints at 
run-time, along program paths obtained by system simulations. We use machine 
learning techniques (treatment learning and function fitting) to approximate the 
system input constraints that will lead to the satisfaction of the unit constraints. 
Execution of system input predictions either uncovers new code regions in the 
unit under analysis or provides information that can be used to improve the ap- 
proximation. We have implemented the technique and we have demonstrated its 
effectiveness on several examples, including one from the aerospace domain. 


1 Introduction 

Modern software, and in particular flight control software like that written at NASA, 
needs to be highly reliable and hence thoroughly tested. NASA software is typically 
tested using system level Monte Carlo or combinatorial simulations. Such system level 
“black-box” simulations have the advantage that they are (a) easy to set up, since the 
user only needs to specify the ranges for the system level inputs, and (b) can be used to 
test software systems that contain COTS (“Commercial-Off-The-Shelf”), binary or even 
hardware components that are impervious to “white-box” methods. However, system 
level simulations provide few guarantees in terms of testing coverage. Furthermore, 
they may be quite expensive. For example, a run using NASA’s ANTARES simulator m 
may take hours to complete. 

Recently, a new set of techniques lEIMl based on symbolic execution 0 have 
emerged for generating test cases that achieve high code coverage. Symbolic execu- 
tion and its variant, concolic execution, are white-box as they collect constraints based 
on the internal code structure. The collected constraints are solved systematically to 
obtain inputs that exercise all the paths through the code (up to some user specified 
bound). Such white-box techniques are not effective in the presence of COTS or binary 
components; e.g., in such cases, concolic execution may lead to divergence [4]. For 
this reason, and due to the large number of paths to explore and complex constraints to 
be solved, white-box symbolic execution is used most effectively for testing individual 
software units, but not the whole system. On the other hand, when analyzing a unit in 
isolation, it is often the case that the unit’s inputs need to be constrained by the system 
calling context, in order to obtain realistic test cases. Encoding input constraints 
requires significant manual effort by developers ED- 

The goal of our work is to find system level test cases that increase the coverage 
of a unit of interest by exploiting a synergy between black-box system simulation and 
white-box unit symbolic execution. We propose an iterative procedure that uses the 
information computed by a symbolic execution of a unit to guide, via machine learning 
techniques, the generation of new system level inputs that increase the coverage of 
the unit, and hence of the system containing the unit. Thus, our approach improves 
on system level testing by increasing the obtained coverage with a reduced number 
of tests, and hence with a reduced cost. It also enables a modular unit level analysis 
under realistic contexts, since symbolic execution is performed along the program paths 
obtained via simulation. 

Specifically, we use data mining techniques (i.e., treatment learning [6]) to obtain an 
approximation of the system level input constraints that influence the satisfaction of the 
unit level constraints computed by the symbolic execution of the unit. Function fitting 
is performed to incrementally approximate the behavior of the unit’s calling context. 
Finally, the unit level constraints are solved with off-the-shelf constraint solvers and, 
together with the approximations, are used to guide the generation of new system level 
inputs towards executing uncovered code regions in the unit under analysis. We have 
implemented the techniques in the context of the analysis of C programs. We report 
here on the application of our approach to several illustrative examples, including one 
from the aerospace domain. 

Related Work. The work related to automated testing is vast and we only highlight 
here the work that is most related to our approach. We have already discussed related 
symbolic and concolic execution approaches II7I4I3I8I1 . The work on carving differential 
unit tests from system tests 0 extracts the components that influence the execution of 
a unit and reassembles them so that the unit can be exercised as it was by the system 
test. Differential unit tests are used to detect differences between multiple unit 
implementations; they cannot be used to guide the system level inputs to increase coverage. 

In previous work 0 we described a symbolic execution framework that used sys- 
tem level simulations to improve the precision of symbolic execution at the unit level. 
This was achieved in two ways: first, the framework allows symbolic execution to be 
started at any point in the program; thus, the concrete execution of the system can be 
effectively used to set up the environment for the symbolic execution of a unit in the 
system. However, that work could not be used for guiding the generation of new sys- 
tem level inputs to increase the coverage of the unit — which is our contribution here. 
Furthermore, we described in 0 how to use the data collected during system level runs 
to mine constraints on the unit level inputs (using treatment learning or Daikon, for 
example). While this approach would allow more focused unit level testing, it suffers 
from the drawback that the mined constraints can be unrealistically restrictive, and thus 
prevent us from achieving coverage of corner cases in the unit. 

2 Background 

A Program Model. A program is a tuple $P = (I, A, C)$, where $I$ is a set of input 
parameters, $A$ is a set of assignment statements and $C$ is a set of conditional 
statements. We assume that the elements of $I$ are of basic types, defined to be a type 
from the set {int, short, unsigned int, char, float, double, enum}, with each 
element $a \in I$ taking values from a domain $D_a$ based on its type; all assignment and 
conditional statements refer to elements in $I$. The set of all executions of the program 
$P$ is $R(P) \subseteq (A \cup C)^*$, a set of finite sequences of assignments and conditional 
statements visited over all possible values of the parameters in $I$. An assignment over 
the parameters in $I$, called a valuation, is denoted by $\mathbf{I}$ and associates every element 
$a \in I$ to a value in $D_a$. Given a valuation $\mathbf{I}$, we assume that all executions of the 
program visit exactly the same finite sequence of assignments and conditional statements; 
that is, the programs are deterministic. 

Concolic Execution. Concolic execution B4| 101 is a technique that combines concrete 
and symbolic program execution to increase path coverage. Symbolic path constraints 
(PCs) are collected along concrete program runs; the PCs are conjunctions of Boolean 
expressions, each expression representing the condition on the inputs to follow that 
particular path. The conditions in the PCs are systematically negated to generate new 
PCs that are solved with off-the-shelf solvers. The obtained solutions are used as new 
program inputs to run the program along different paths. The process terminates when 
all the paths have been resolved or a user-specified bound has been reached; paths are 
either covered, unsatisfiable or unsolvable due to limitations in the chosen solvers. 
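
The negation step can be sketched as follows. This is a minimal illustration only: the `Branch` type, the integer condition ids and the function name `negate_at` are assumptions made for the example and are not taken from any particular concolic engine. A path constraint is modelled as the sequence of branch outcomes seen on a concrete run; a new path constraint keeps a prefix of that sequence and flips the outcome of the next condition, after which it would be handed to a solver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one conjunct of a path constraint:
   which condition was evaluated and which outcome the run took. */
typedef struct {
    int cond_id; /* identifier of the conditional statement */
    int taken;   /* 1 if the condition held on this run, 0 otherwise */
} Branch;

/* Builds a new path constraint from an observed one: copies the first k
   conjuncts unchanged and appends conjunct k with its outcome negated.
   `out` must have room for k + 1 entries. Returns the new length. */
size_t negate_at(const Branch *path, size_t len, size_t k, Branch *out) {
    assert(path != NULL && out != NULL && k < len);
    for (size_t j = 0; j < k; ++j) {
        out[j] = path[j];          /* keep the prefix as observed */
    }
    out[k] = path[k];
    out[k].taken = !path[k].taken; /* explore the other side of branch k */
    return k + 1;
}
```

Calling `negate_at` for every k from 0 to len - 1 yields one candidate path constraint per branch on the observed path; which of them are satisfiable is decided by the solver, which this sketch does not model.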

Treatment Learning. Treatment learning [6,11] is a machine learning technique that 
finds the minimal difference between two sets. In our work, we use treatment learning 
to determine a small number of controllable inputs and ranges (a treatment) that are 
most likely to lead to some output. 

TAR3 is a treatment learner that finds association rules involving both continuous 
and discrete variables quickly QL. Given a data set and a partition of that set into a set 
of desired data points and a set of all remaining points, TAR3 looks for rules (subsets 
of input parameters and their ranges) that maximize the likelihood of seeing points in 
the desired set. We note that one can use other association rule learners [12,13,14] to 
potentially find more accurate rules; however, this would come with greater complexity 
and time costs [15,16]. 

Function Fitting. Function fitting finds a predictive relationship between associated 
outputs and inputs (usually one output variable and a small number of inputs). We use 
discrete least-squares function fitting [17,18] to approximate a relationship between the 
unit inputs and the associated system inputs; the technique is less sensitive to outliers 
than many competing techniques m- Assume $y(x)$ is a complex, non-linear 
function; its approximation can be given by a polynomial $p(x)$ with coefficients $c_i$, for 
$i \in \{1, 2, 3, \ldots\}$. A least-squares solution finds the constant values $c_i$ that minimize 
the total Euclidean distance (the residual) between $p(x)$ and $y(x)$ at the given 
measurements $x$. If the relationships we are trying to approximate are Lipschitz continuous (or 
smooth), we can find a polynomial approximation that is arbitrarily close to our desired 
function by the Weierstrass Approximation Theorem [20]. A function that is not smooth 

Fig. 1. A system S with inputs I and an embedded unit U with inputs i 


along its entire domain may be locally smooth, or smooth along some subinterval of the 
domain. A polynomial constructed on this subset is known as a piecewise-polynomial 
approximation. Shrinking each subinterval allows for arbitrarily close approximations 
with low-order polynomials [21]. We use the term Threshold to represent the minimum 
number of data points that we need for function fitting. 
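
For the simplest case, a degree-one polynomial p(x) = c0 + c1*x, the least-squares coefficients have a closed form. The sketch below is illustrative only: the function name `fit_line`, its return convention and the degeneracy tolerance are assumptions for this example and do not describe the authors' implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Fits y ~ c0 + c1 * x by discrete least squares over n measurements.
   Returns 0 on success, or -1 when fewer than two points are given or
   all x values coincide (the normal equations are then singular). */
int fit_line(const double *x, const double *y, size_t n,
             double *c0, double *c1) {
    assert(x != NULL && y != NULL && c0 != NULL && c1 != NULL);
    if (n < 2) {
        return -1;
    }
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t k = 0; k < n; ++k) {
        sx += x[k];
        sy += y[k];
        sxx += x[k] * x[k];
        sxy += x[k] * y[k];
    }
    double denom = (double)n * sxx - sx * sx;
    if (denom > -1e-12 && denom < 1e-12) {
        return -1; /* degenerate: no unique line through the data */
    }
    *c1 = ((double)n * sxy - sx * sy) / denom; /* slope */
    *c0 = (sy - *c1 * sx) / (double)n;         /* intercept */
    return 0;
}
```

With measurements (1, 4), (2, 5), (3, 6) the fit recovers c0 = 3 and c1 = 1. A piecewise-polynomial approximation would apply the same routine separately on each subinterval.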


3 Approach 

We illustrate the proposed approach using Figure 1, which shows a System Under Test 
(S) that may have both white-box and black-box components. A white-box unit is a 
code fragment that lends itself to concolic testing. S is a system with input parameters 
I containing a white-box unit U = (i, A, C) with unit level parameters i. The goal is to 
generate system level inputs I that increase the coverage of unit U. 

Let $c \in C$ denote some conditional statement in $U$ that was not covered during 
system level testing. Let $Cons(c)$ denote the unit level constraint, over parameters in 
$i$, associated with statement $c$; this constraint is obtained by the concolic execution 
of $U$. As an example, if $i = \{v, w\}$, a constraint could be $(v > w)$. We note that the 
concolic execution of U (in isolation) excludes the system that instantiates U ; while this 
is useful for discovering new constraints for the uncovered paths, it may also generate an 
over-approximation of the actual paths that can be covered during system level testing. 
By the same token, paths that are unreachable in U remain unreachable in S: a path 
unreachable in the most liberal environment for U remains unreachable in S. If Cons(c) 
is satisfiable, then a satisfying valuation i will enable us to cover statement c at the unit 
level, but as mentioned, that statement may still be unreachable at the system level. Our 
goal is to try to generate assignments over the system level parameters I that can cover 
c (and other statements in the unit) during system level testing. 

We note that the calling context for the unit can be represented by some function 
$f$ such that $i = f(I)$. To discover the new valuations for $I$, we monitor the values 
of $I$ and $i$ during simulations and use machine learning techniques to approximate $f$, 
based on the monitored values. Once we have an approximation $p$ of $f$, we use it to 
solve $i = p(I) \wedge Cons(c)$; the solutions for $I$ are the likely candidates for the system 
level inputs that lead to the satisfaction of Cons(c). These valuations are used to start 
new simulation runs, which lead to either covering c or to obtaining a more accurate 

Program 1. Prototype Linear Example 

int g1 = 1, g2 = 2; 

int Unit(int i1, int i2);   /* declared first because System calls it */ 

int System(int I1, int I2) { 
    /* the calling context sets the globals that Unit reads */ 
    if (I1 > 0) g1 = I2; else g1 = -I2; 
    g2 = I1 + 3; 
    return Unit(I2, I1); 
} 

int Unit(int i1, int i2) { 
    if (i1 > 0) { 
        i2 = g2; 
        if (i2 > 0) return 0; else return 1; 
    } else { 
        i2 = g1 + 3; 
        if (i2 > 0) return 2; else return 3; 
    } 
} 


approximation of $f$. The process is repeated until either the desired coverage is obtained 
or a user-specified bound has been reached. We note here that if the function relating 
$I$ and $i$ is invertible, one can learn an approximation of the form $I = p(i)$ and use the 
solutions of $Cons(c)$ to directly obtain the valuations of $I$. To simplify the presentation, 
we will assume for the rest of the paper that we have such invertible functions. We 
describe our approach in detail in the next section. 

4 Testing Algorithms 

As a running example, consider the linear code in Program 1. Integers I1 and 
I2 are the system inputs, while i1 and i2 are the unit inputs. The two integer global 
variables g1 and g2 are treated as inputs to both System and Unit. The unit inputs are 
therefore i1, i2, g1 and g2. 

Constraints Trees. We assume concolic execution achieves full path coverage over 
Unit. The set of path constraints over all executions of Unit are stored in a constraints 
tree T. The constraints tree reflects the set of all paths that were taken by all executions 
of a program unit (assume that the unit has no infinite loops). 


1 [Parameters] 
2 i1 
3 g2 
4 g1 
5 [Tree] 
6 (i1 > 0) (C) 
7   (g2 > 0) (C) 
8   (g2 <= 0) (S) 
9 (i1 <= 0) (C) 
10  ((g1 + 3) > 0) (C) 
11  ((g1 + 3) <= 0) (S) 

Fig. 2. The constraints tree after some rounds of initial testing 

Fig. 3. A graphical representation of Figure 2. Covered nodes are solid circles; 
those not covered are dotted. 

Figure 2 shows $T$ for Unit after some initial testing. Lines 2-4 list the inputs that 
are constrained. Lines 6-11 contain a textual representation of the tree. The number 
of leaves is equal to the number of path constraints in $T$; each path constraint is a 
conjunction of the terms encountered along the parent hierarchy starting at each leaf. 
Therefore, given the tree in Figure 2, the set of constraints is: $(i1 > 0) \wedge (g2 > 0)$, 
$(i1 > 0) \wedge (g2 \le 0)$, $(i1 \le 0) \wedge ((g1 + 3) > 0)$ and $(i1 \le 0) \wedge ((g1 + 3) \le 0)$. Of these 
constraints, $(i1 > 0) \wedge (g2 > 0)$ and $(i1 \le 0) \wedge ((g1 + 3) > 0)$ were covered during 
our initial testing, denoted by the letter “C” within parentheses. The other constraints 
are satisfiable at the Unit level but not covered during system level testing, denoted by 
the letter “S”. 
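
One possible in-memory form of such a tree is sketched below. The `Node` layout, the `Status` enum, the depth bound of 32 and the function name `path_constraint` are all assumptions made for this illustration; the textual terms are simply strings here, whereas a real implementation would hold solver expressions. The function rebuilds the path constraint of a node by walking its parent hierarchy and conjoining the terms root first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { COVERED, SATISFIABLE } Status; /* "(C)" and "(S)" in Fig. 2 */

typedef struct Node {
    const char *term;          /* e.g. "(g2 <= 0)" */
    Status status;
    const struct Node *parent; /* NULL for a child of the tree root */
} Node;

/* Writes the conjunction of terms from the outermost ancestor down to n
   into buf (capacity cap, always NUL-terminated, truncated if too small).
   Returns the number of terms on the path, at most 32 in this sketch. */
int path_constraint(const Node *n, char *buf, size_t cap) {
    const Node *chain[32];
    int depth = 0;
    assert(buf != NULL && cap > 0);
    for (const Node *p = n; p != NULL && depth < 32; p = p->parent) {
        chain[depth++] = p; /* collected leaf first */
    }
    buf[0] = '\0';
    for (int k = depth - 1; k >= 0; --k) { /* emit root first */
        strncat(buf, chain[k]->term, cap - strlen(buf) - 1);
        if (k > 0) {
            strncat(buf, " && ", cap - strlen(buf) - 1);
        }
    }
    return depth;
}
```

For the node on line 8 of Figure 2, whose parent is the node on line 6, this produces the string `(i1 > 0) && (g2 <= 0)`.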

Observations. Consider again a system $S$, with system inputs $I$, and a unit $U$ within 
$S$, with unit inputs $i$. We let $d = |I|$. We assume the unit can be fully analyzed using 
concolic execution. Let $T$ be a constraints tree extracted by monitoring $U$ during system 
level testing. Consider nodes in $T$ that are satisfiable at the unit level but not covered by 
system level testing. We attempt to cover such nodes using a combination of concolic 
execution, treatment learning and function fitting. For a node $n$ in $T$ we take $Cons(n)$ 
as the unit constraint that leads to $n$ and that, when satisfied, will cover $n$. To present 
our coverage algorithm, we first make the following observations. 

Consider a path $\sigma = n_1, n_2, \ldots, n_k$ in $T$ such that all nodes $n_i$ for $1 \le i \le k$ are 
covered by system testing. There exist vectors at the system and unit level that witness 
covering each node $n_i$ in $\sigma$; for a set of system vectors $V_i$ that witness covering $n_i$ in 
$\sigma$, there exist corresponding witnesses $v_i$ of unit vectors. We then have the following 
properties of these witnesses: 

Observation 1 (Monotonicity of Witnesses). For a constraints tree $T$ and a path $\sigma = 
n_1, n_2, \ldots, n_k$ of nodes in $T$, such that $n_1, n_2, \ldots, n_k$ are covered with witness sets 
$V_1, V_2, \ldots, V_k$ at the system level and corresponding sets $v_1, v_2, \ldots, v_k$ at the unit level, 
we have $V_1 \supseteq V_2 \supseteq \ldots \supseteq V_k$ and $v_1 \supseteq v_2 \supseteq \ldots \supseteq v_k$. 

Monotonicity of Witnesses follows easily by noting that $Cons(n_k) \Rightarrow Cons(n_{k-1}) \Rightarrow 
\ldots \Rightarrow Cons(n_1)$ for the constraints of nodes in $\sigma$. 

Observation 2 (Sufficiency of Witnesses). For a constraints tree $T$ and a path $\sigma = 
n_1, n_2, \ldots, n_k$ of nodes in $T$, such that $n_1, n_2, \ldots, n_k$ are covered with witness sets 
$V_1, V_2, \ldots, V_k$ at the system level and corresponding sets $v_1, v_2, \ldots, v_k$ at the unit level, 
let $|V_j| > Threshold$ such that for all $i \in [1, k]$ with $|V_i| > Threshold$, we have 
$|V_j| \le |V_i|$. If the relation between $V_j$ and $v_j$ is smooth for function fitting, then for all 
$i > j$, the relation between $V_i$ and $v_i$ is also smooth for function fitting. 

Consider $T$ and $\sigma = n_1, n_2, \ldots, n_k$ in $T$ such that all nodes that precede $n_k$ are 
covered during system testing, but node $n_k$ is not covered. Since concolic execution fails 
at the system level, we have that $Cons(n_k)$ is the finest symbolic path constraint, such 
that when $Cons(n_k)$ is satisfiable, the assignment that satisfies $Cons(n_k)$ covers $n_k$ 
at the unit level. We take $Term(n_k)$ as the term corresponding to $n_k$ and $Parent(n_k)$ 
as the parent of $n_k$ in $\sigma$. Given a constraint $C$, let $Vars(C)$ be the set of 
parameters that appear in the terms of constraint $C$. The path constraint $Cons(n_k)$ is then 
$Term(n_1) \wedge Term(n_2) \wedge \ldots \wedge Term(n_k)$. We would like to learn the system level 
behavior as a function $f$, such that $I = f(Vars(Cons(n_k)))$, via function fitting. If $Cons(n_k)$ 
is satisfiable, we can use $f$ to find a system level vector that covers $n_k$ using the 
satisfying assignment over $Vars(Cons(n_k))$ for $Cons(n_k)$. The caveat in this approach is 
that function fitting is difficult over large data sets due to both the number of parameters 
involved and due to the presence of discontinuities. We tackle this problem as follows: 

- We function fit for $C$, starting at $Term(n_k)$, progressively conjoining terms 
$Term(n_i)$ for $i = k-1, k-2, \ldots, 1$, stopping when we find a smooth 
function. This reduces the number of unit vectors we consider and, by the Sufficiency of 
Witnesses, considers the smaller number of data points. 

- We reduce the number of system parameters for function fitting using treatment 
learning. For $C$, we use the data seen during system testing to find the subset $I_n \subseteq I$ 
of system parameters that most affect the values of the unit parameters in $Vars(C)$. 

For all terms in $Cons(n_k)$ that are not considered in a given iteration of function fitting, 
i.e., terms in $Cons(n_k)$ but not in $C$, we use treatment learning to find satisfying 
assignments. By the Monotonicity of Witnesses, we have more data points to cover these 
terms than to cover $Cons(n_k)$, increasing the likelihood of finding good treatments. 

Algorithm. We now describe Cover, our coverage algorithm presented in Algorithm 1. 
The algorithm works as follows: 

1. Lines 2-4. We perform n-factor combinatorial Monte Carlo (MC) simulations by 
picking values over a space $sp$: a $d$-dimensional space for the $d$ input parameters 
constrained by their data types. Unlike traditional random MC, n-factor MC 
generates test cases such that every possible combination of input parameters equal to 
size n appears at least once in the test suite [22]. For every system vector $a$, we 
monitor the unit and capture the unit vector $b$ together with the path constraint for 
the path taken within the unit. The set of path constraints is summarized in $T$; 
system and unit vectors are stored in sets $V$ and $v$. 

2. Lines 7-11. We traverse the nodes in $T$ in breadth first order. The treatment learner 
learns a treatment for each node $n$ in $T$ as long as its sibling is also covered. Since 
the treatment learner is a contrast set learner, it can be used to identify a set $I_n \subseteq I$ 
and ranges $R_n$ of parameters in $I_n$ only when given data points that differentiate $n$ 
from its sibling. 

3. Lines 13-16. For each satisfiable node $n$ in $T$ not covered by MC simulations, 
we store the assignment $\mathbf{i}$ satisfying $Cons(n)$. We start with a constraint $C$ set to 
$Term(n)$ and progressively strengthen $C$ until we find a system vector to cover $n$. 
As we want to fit a function that relates $I$ to $i$, we keep track of the parameters in $C$ in 
$i_n$ and the restriction of $\mathbf{i}$ to the parameters $i_n$ in $\mathbf{i}_n$. The function ComputeMap 
finds a function $f_n$ such that $I_n = f_n(i_n)$ using function fitting. 

4. Lines 17-19. We iterate over all satisfiable nodes $n$ in $T$ not covered during system 
testing. For each such $n$ we run a system level test by composing a system vector 
as follows: (a) take $I_n = f_n(i_n)$ such that it is consistent with the ranges $r_j$ for 
all $j \in I_n$ as returned by the treatment learner in Line 10 and (b) for all other 
system level parameters $j \in I \setminus I_n$, pick a value from the ranges $r_j$ returned by the 
treatment learner in Line 10. 
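
Step 4 can be pictured with the following sketch. The `Range` type, the `fitted` flag array and the function name `compose_system_vector` are hypothetical, and picking the midpoint of a learned range is only one possible choice for "pick a value from the ranges"; the text does not say which value is selected.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double lo, hi; /* range learned by the treatment learner for one input */
} Range;

/* Composes one system level test vector of dimension d.
   fitted[j] != 0 marks parameters in I_n, whose value predicted[j] comes
   from the fitted map f_n; all other parameters get a value drawn from
   their learned range (here: the midpoint, an assumption of this sketch).
   Returns 0 on success, or -1 if a fitted value contradicts its range. */
int compose_system_vector(size_t d, const int *fitted,
                          const double *predicted, const Range *range,
                          double *out) {
    for (size_t j = 0; j < d; ++j) {
        assert(range[j].lo <= range[j].hi);
        if (fitted[j]) {
            if (predicted[j] < range[j].lo || predicted[j] > range[j].hi) {
                return -1; /* inconsistent with the learned treatment */
            }
            out[j] = predicted[j];
        } else {
            out[j] = 0.5 * (range[j].lo + range[j].hi);
        }
    }
    return 0;
}
```

A caller that receives -1 could, for example, fall back to further simulations; that policy is outside what this sketch shows.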

Algorithm 1. Cover($S$, $U$) 

input : System $S$ with inputs $I$ with $d = |I|$, unit $U$ with inputs $i$ 

1  $sp \leftarrow \mathbb{R}^d$; 
2  Perform n-factor combinatorial MC simulations over space $sp$; 
3  $(V, v) \leftarrow \{(a, b) \mid a$ is a system level vector and $b$ is the corresponding monitored 
   unit level vector$\}$; 
4  $T \leftarrow$ (PC from $U$); 
5  repeat 
6      $T' \leftarrow T$; 
       // Do BFS on T 
7      for (node $n$ in $T$ using BFS) do 
8          if ($n$ and $n$'s sibling are covered) then 
               // Use contrasting data to learn a treatment 
9              $V' \leftarrow \{a \in V \mid a$ covers $n\}$ and $V'' \leftarrow V \setminus V'$; 
10             $(I_n, R_n, -) \leftarrow$ RunTAR3($I$, $V$, $V'$, $V''$); 
11             $\forall j \in I_n$ store the range $r_j \in R_n$ for $j$; 
12         else 
13             if ($n$ is satisfiable but not covered) then 
                   // Compute $f_n$ such that $I_n = f_n(i_n)$ 
14                 $\mathbf{i} \leftarrow$ model for $Cons(n)$; 
15                 $C \leftarrow Term(n)$; 
16                 $(I_n, i_n, f_n) \leftarrow$ ComputeMap($C$, $I$, $V$, $v$, $n$, $Parent(n)$, $\mathbf{i}$); 
       // Build new test-cases 
17     for ($n$ in $T$ satisfiable but not covered) do 
18         Run $S$ with a consistent valuation using $f_n(i_n)$ and $\forall j \in I \setminus I_n$ using $r_j$ 
           from Line 10 
19         $T' \leftarrow T' \cup$ (PC from $U$); 
20     $T \leftarrow T'$; 
21 until ($T$ has no unprocessed nodes); 


The function fitting algorithm ComputeMap, shown in Algorithm 2, works as 
follows: 

1. Lines 1-4. We compute $i_n$ occurring in $C$ and the restriction of the model $\mathbf{i}$, for 
$Cons(n)$, to $i_n$. We use treatment learning to isolate a set $I_n \subseteq I$ most likely to 
affect $i_n$ and to determine if the data points in $V$ and $v$ have a smooth relationship. 

2. Lines 5-6. If the relationship is smooth we build the map $f_n$ such that $I_n = f_n(i_n)$. 

3. Lines 8-10. If the relationship is not smooth, we strengthen $C$ by including the 
parent term from $Cons(n)$ and then recursively call ComputeMap. 

4. Lines 12-22. If we cannot find a smooth relationship by including all terms in 
$Cons(n)$, then we use the Sufficiency of Witnesses to walk up the parent 
hierarchy of $n$ to reach a node $n''$ that has at least Threshold data points that witness 
covering $n''$. By Assumption 2, we have at least one path that was taken through 
the unit during system testing. If we find two data points that covered a node in the 
parent hierarchy of $n$, we attempt a linear fit and return. If we cannot find at least 
two data points, we run more MC simulations. 
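
The walk up the parent hierarchy in step 4 might look like the sketch below. The `WNode` type, which stores only a witness count and a parent link, and the name `first_ancestor_with_data` are assumptions for illustration. Unlike the loop in Algorithm 2, which falls through to the topmost node when no ancestor qualifies, this version returns NULL so that the caller can tell the two cases apart.

```c
#include <assert.h>
#include <stddef.h>

typedef struct WNode {
    size_t witnesses;           /* number of system vectors that cover it */
    const struct WNode *parent; /* NULL at the top of the hierarchy */
} WNode;

/* Returns the nearest proper ancestor of n whose witness count exceeds
   threshold, or NULL if no ancestor has that many witnesses. */
const WNode *first_ancestor_with_data(const WNode *n, size_t threshold) {
    assert(n != NULL);
    for (const WNode *p = n->parent; p != NULL; p = p->parent) {
        if (p->witnesses > threshold) {
            return p;
        }
    }
    return NULL;
}
```

By the Monotonicity of Witnesses the counts can only grow towards the top of the tree, so the first ancestor that passes the test is also the one with the fewest witnesses among those that qualify.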


302 M. Davies, C.S. Pasareanu, and V. Raman 


Algorithm 2. ComputeMap($C$, $I$, $V$, $v$, $n$, $n'$, $\mathbf{i}$) 

input : Constraint $C$ such that $Cons(n) \Rightarrow C$, system inputs $I$, system vectors $V$, 
        unit vectors $v$, a node $n$ that we want to cover, a node $n'$ that is in the parent 
        hierarchy of $n$ and a model $\mathbf{i}$ for $Cons(n)$ 
output: $(I_n, i_n, f_n)$ where $I_n = f_n(i_n)$ and $i_n = Vars(C)$ 

1  $i_n \leftarrow Vars(C)$; 
2  $\mathbf{i}_n \leftarrow$ restriction of $\mathbf{i}$ to $i_n$; 
   // Find a subset of I for function fitting 
3  $V' \leftarrow \{a \in V \mid a$ is in 20% of points closest to $Cons(n)\}$ and $V'' \leftarrow V \setminus V'$; 
4  $(I_n, R_n, smooth) \leftarrow$ RunTAR3($I$, $V$, $V'$, $V''$); 
5  if (smooth) then 
6      Build map $I_n = f_n(i_n)$; 
7  else 
       // Strengthen constraint and try again 
8      if ($n'$ exists) then 
9          $C \leftarrow C \wedge Term(n')$; 
10         $(I_n, i_n, f_n) \leftarrow$ ComputeMap($C$, $I$, $V$, $v$, $n$, $Parent(n')$, $\mathbf{i}$); 
11     else 
           // If no smooth relation between $I_n$ and $i_n$, then 
           // walk up the parent of n, pick a node with 
           // Threshold points, and attempt a linear fit 
12         $n'' \leftarrow n$; 
13         while ($Parent(n'')$ exists) do 
14             $C \leftarrow C \wedge Term(Parent(n''))$; 
15             $n'' \leftarrow Parent(n'')$; 
16             $V' \leftarrow \{a \in V \mid a$ covers $n''\}$; 
17             if ($|V'| > Threshold$) then 
18                 break; 
19         $V'' \leftarrow V \setminus V'$; 
20         $(I_n, R_n, -) \leftarrow$ RunTAR3($I$, $V$, $V'$, $V''$); 
21         $i_n \leftarrow Vars(C)$; 
22         Build map $I_n = f_n(i_n)$; 

We use the treatment learning algorithm TAR3, presented in Algorithm 3, for the 
following two purposes in our coverage algorithm: 

Learning Rules for Covered Nodes. We use TAR3 to determine the subset of system 
inputs and their ranges that covered nodes at the unit level. For every node n that was 
covered during system testing, if its sibling was also covered, then we have a partition 
of the data points at the system level into one set that covered n and the other set that 
covered its sibling. We use TAR3 with these partitions to learn rules that will either visit 
n or its sibling (Line 10 of Algorithm 1). We use these rules at Line 18 to pick values for 
a subset of I as described in the algorithm. 

Algorithm 3. RunTAR3($I$, $V$, $V'$, $V''$) 

input : System level parameters $I$, system level vectors $V$ and contrast sets $V' \subseteq V$ 
        and $V'' = V \setminus V'$. 
output: $(I', R, smooth)$ where $I' \subseteq I$, $R$ is a set of ranges for each parameter in $I'$, 
        smooth is set to true by examining the output 

1 Call TAR3 with $V$, $V'$ and $V''$; 
2 Compose $I' \subseteq I$, $R$ and smooth based on the results of running TAR3; 
3 Return $(I', R, smooth)$; 


Learning Inputs for Function Fitting. We attempt to fit a function to cover node $n$ 
using a weak $C$ initially set to $Term(n)$. This $C$ is progressively strengthened as seen 
in Algorithm 2. For each $C$, we construct contrast sets by partitioning the data points 
into (a) the 20% of the data points nearest in Euclidean distance to the PC boundary and 
(b) all remaining points. These sets are used to learn a small subset of $I$ most influencing 
$i$ close to the PC boundary. We use this reduced subset of $I$ for function fitting. 
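
The construction of the contrast sets can be sketched as below, assuming the distance of each data point to the PC boundary has already been computed (how that distance is obtained is not shown). The name `nearest_fraction`, the rounding up of the set size and the handling of ties are choices made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Marks in_near[k] = 1 for the ceil(frac * n) points with the smallest
   dist[k] and 0 for the rest; the marked points form one contrast set and
   the unmarked points the other. Returns the number of marked points,
   or 0 if n is 0 or memory cannot be allocated. */
size_t nearest_fraction(const double *dist, size_t n, double frac,
                        int *in_near) {
    assert(frac > 0.0 && frac <= 1.0);
    if (n == 0) {
        return 0;
    }
    size_t want = (size_t)(frac * (double)n);
    if ((double)want < frac * (double)n) {
        ++want; /* round up so the near set is never empty */
    }
    double *sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL) {
        return 0;
    }
    memcpy(sorted, dist, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_double);
    double cutoff = sorted[want - 1];
    free(sorted);

    size_t taken = 0;
    for (size_t k = 0; k < n; ++k) { /* strictly closer than the cutoff */
        in_near[k] = dist[k] < cutoff;
        taken += (size_t)in_near[k];
    }
    for (size_t k = 0; k < n && taken < want; ++k) { /* fill with ties */
        if (!in_near[k] && dist[k] == cutoff) {
            in_near[k] = 1;
            ++taken;
        }
    }
    return taken;
}
```

With ten points and frac set to 0.2, the two points closest to the boundary are marked and the remaining eight form the second contrast set.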

As an example, in Figure 4, the desired i are represented by the gray rectangle in 
the center of the plot. Curves are built from data pairs seen during program execution; 
dotted circles surround the data nearest the PC boundary and comprise a contrast set. 
TAR3 returns the I that most affect the i near the PC boundary. We also use TAR3 to 
determine whether a smooth relationship exists between subsets of i and I. In Figure 4, 
the relationship between i and I appears to be discontinuous. To each side of the PC 
boundary, a small variation in system values leads to a large variation in the unit values; 
it is possible to get two different unit values for the same system level value. 



[Plots omitted. Recoverable axis labels: System Variable Value (I); I2.] 

Fig. 4. A non-smooth relationship between a system and a unit parameter. The gray 
region represents values of i not seen during testing. Dotted circles surround data closest 
to the boundary. 

Fig. 5. Bars outline a rule that guides execution through Node 2. Data points (asterisks) 
are boxed if the runs pass through Node 2. The dotted oval outlines a contiguous 
region that suggests $f_2$ is smooth. 


Discussion. We now discuss the assumptions made in our coverage algorithm and also 
the conditions under which the algorithm makes progress. We make the following as- 
sumptions in our coverage algorithm: 

1. The unit U can be analyzed using concolic execution. 

2. At least one path in U is taken during system testing. 

The first assumption is required since our goal is to use unit level concolic execution 
to improve system testing. The second assumption may be satisfied using one of the 
following two approaches: 

1. Iteratively choose smaller systems that enclose U, until we find a system such that 
at least one path is taken in U during system testing. 

2. Pick the earliest method U' up the call chain of U that has at least one path covered 
during system testing and then run Cover(S , U'). This increases the test vectors 
that explore U' and hence the likelihood of taking paths in U. 

We remark that by using a breadth first exploration of the constraints tree, we ensure 
that when we attempt to cover a node, all its parent nodes have been processed. This 
ensures that when we build a system level vector for a node n, we have learnt ranges for 
all nodes in its parent hierarchy; the system level vector is composed using these ranges 
and the function f n . 

Remark 1 (Progress). In the presence of perfect function fitting, if we have an 
over-approximation of the subset $I_n$ that affects $i_n = Vars(Cons(n))$ for every node 
$n$ that is satisfiable at the system level, then the algorithm will eventually cover $n$. 

Consider a satisfiable node $n$ that cannot be covered by considering any constraint 
weaker than $Cons(n)$. As we strengthen $C$ from $Term(n)$ to $Cons(n)$, we 
eventually include in $C$ all terms from $Cons(n)$ and all $i_n$ in $Vars(Cons(n))$. If we find 
a perfect function $f$, such that $I_n = f(i_n)$, and if $I_n$ includes all the $I$ that affect $i_n$, 
we are guaranteed to cover $n$. We use TAR3 to extract $I_n$. We can supplant TAR3 with 
static analysis techniques, such as El, to learn an over-approximation of the set $I_n$. 
Note that due to loops or recursion, our algorithm may not terminate. 

5 Experience 

In this section, we present our experience using the technique proposed in this paper 
on several examples. Two of these examples are purely illustrative, the third is a clas- 
sic aerospace example. Planned experiments include larger aerospace examples: flight 
control software for unmanned aerial vehicles and a prototype conflict detection and 
resolution algorithm. 

Our algorithms are implemented in the context of analyzing C code. We use MAT- 
LAB scripts to generate an initial suite of system vectors V given the known I, and to 
execute programs instrumented for concolic execution. The concolic execution 
framework is implemented using CIL [24], the C Intermediate Language, which provides an 
API for the analysis of C programs, to instrument user code. We use CIL to walk the 
intermediate representation of the program and insert calls to a set of runtime listen- 
ers. The user program is then re-generated from the intermediate representation, linked 
with our runtime library and run. During MC simulations, we use the instrumented ver- 
sion of the unit to monitor unit and system inputs and to capture paths that were taken 
within the unit. The constraints tree generated during MC simulations is used as an 
input to a subsequent solve cycle, where we solve for paths not taken within the unit 
during system level testing, replay solutions found and thus explore the tree to 
completion; we solve path constraints using Yices [25]. The outputs of these steps are a 
fully explored constraints tree T together with models for all satisfiable paths, a set of 
unit vectors v and the corresponding system vectors V that we monitored during MC 
simulations. These outputs are fed to MATLAB scripts that use I, T, i, V and v to 
perform treatment learning and function fitting, and to predict new I that better cover T in 
subsequent iterations. Two steps in our current process are manual, and we have plans 
to automate both: a) determining whether TAR3’s treatments suggest smooth functions, 
and b) choosing whether to begin execution of the new I. 
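
The instrumentation step can be pictured with a small hand-written sketch. The listener name `on_branch`, the trace buffer and the toy `Unit` body are hypothetical illustrations, not CIL-generated code and not the runtime library described above; they only show how a call wrapped around each branch condition can record the path taken without changing control flow.

```c
#include <stddef.h>

#define MAX_TRACE 64

/* Hypothetical runtime listener state: one entry per branch executed. */
typedef struct {
    int branch_id;  /* which conditional in the unit */
    int taken;      /* 1 if the condition held, 0 otherwise */
} BranchEvent;

static BranchEvent trace[MAX_TRACE];
static size_t trace_len = 0;

/* Hypothetical listener: records the outcome and returns it unchanged,
   so it can wrap a condition without altering control flow. */
static int on_branch(int branch_id, int cond)
{
    if (trace_len < MAX_TRACE) {
        trace[trace_len].branch_id = branch_id;
        trace[trace_len].taken = (cond != 0);
        trace_len++;
    }
    return cond;
}

/* Toy unit with two nested conditionals, instrumented by hand.
   The return value labels the leaf reached; it does not correspond
   to the node numbers used in the figures. */
static int Unit(double i1, double g2)
{
    if (on_branch(1, i1 > 0)) {
        if (on_branch(2, g2 < 0))
            return 2;
        return 1;
    }
    return 0;
}
```

After a call such as `Unit(1.0, 5.0)`, the trace holds the sequence of branch outcomes, which is the raw material from which a path constraint can be assembled.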

A Piecewise Linear Case Study. We will first use the simple, piecewise linear
implementation in Program 1. Although the f_n for this example can be found by hand or by
symbolic execution, we use it here to illustrate our technique. Unit is instrumented to
perform concolic execution and graphical results are shown in Figure 3. All invocations
of Unit begin at Node 1 in Figure 3. Control flow from Node 1 is determined by f_1,
which is i1 = I2. If I2 > 0, control flow passes to Node 2; otherwise, to Node 5. For
demonstration, we treat f_1 as unknown, and determine it using our heuristic methods.

We initially create 25 test cases using values for I1 and I2 between -2 and 2
(Algorithm 1, Lines 2-4). Nodes 4 and 7 within Unit are not covered; concolic execution
provides the unit input constraints that will cover them. Figure 2, Lines 2-4 give the
required unit level parameters: g2, g1, and i1. Lines 6-11 show T for Unit; Line 11
corresponds to Node 7, and has an ‘S’ to show that the constraint is satisfiable at the
unit level.

The generated constraint tree is traversed using breadth-first search (Algorithm 1,
Lines 7-16). Lines 6 and 9 in Figure 2 indicate covered sibling nodes (Algorithm 1, Line
8); TAR3 automatically returns the rule set for passing through Node 2, (0.5 < I2 < 2),
as shown by parallel bars in Figure 5. Similarly, TAR3 discovers (-2 < I2 < 0.5) for
passing through Node 5. Note that TAR3 does not capture the exact location of the
constraint boundary between Nodes 2 and 5. TAR3 cannot learn system constraints for
Nodes 3 and 6 as there is no contrasting data.
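
To make the idea of a range rule concrete, the following sketch computes, for one system input, the interval spanned by the samples that reached a given node and checks that no contrasting sample falls inside it. This is a deliberately simplified stand-in and not TAR3 itself; the type `RangeRule`, the function `learn_range` and the sample values in the usage note are assumptions made for illustration.

```c
#include <stddef.h>

typedef struct {
    double lo, hi;  /* observed range of the input for the target node */
    int    pure;    /* 1 if no sample of another node falls inside [lo, hi] */
    int    found;   /* 0 if the target node had no samples at all */
} RangeRule;

/* x[k] is the system input of sample k, node[k] the node that sample reached. */
static RangeRule learn_range(const double *x, const int *node, size_t n, int target)
{
    RangeRule r = {0.0, 0.0, 1, 0};

    /* Bounding interval of the samples that reached the target node. */
    for (size_t k = 0; k < n; k++) {
        if (node[k] != target) continue;
        if (!r.found) { r.lo = r.hi = x[k]; r.found = 1; }
        else {
            if (x[k] < r.lo) r.lo = x[k];
            if (x[k] > r.hi) r.hi = x[k];
        }
    }

    /* The interval is only useful as a rule if the contrast set avoids it. */
    for (size_t k = 0; r.found && k < n; k++)
        if (node[k] != target && x[k] >= r.lo && x[k] <= r.hi)
            r.pure = 0;

    return r;
}
```

With made-up samples I2 = {-2, -1, 0, 1, 2} reaching nodes {5, 5, 5, 2, 2}, the rule for Node 2 is the interval from 1 to 2. As with the TAR3 rule above, the learned range separates the two nodes but does not locate the boundary between them exactly. When a node has no contrasting samples, every interval is trivially pure, which is why no useful constraint can be learned in that case.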

TAR3 is then used to reduce the subset of values of I_n for function fitting. Contrast
data sets are built by isolating the 20% of unit input data nearest the constraint boundary.
For Node 4, TAR3 suggests that g2 depends on a smooth relationship involving only I1.
To cover Node 7, our approach first considers all data satisfying the weakest constraint
(g1 < -3); TAR3’s results are in Figure 6. The data nearest in value to the constraint
boundary are spread discontinuously across I1 and I2 space. TAR3 makes a prediction
involving a subset of the points. This happens when the relationship between i and
I is not smooth; in this case, the relationship between g1 and I2 has a discontinuity
at I1 = 0. The constraint is strengthened by considering the data satisfying i1 < 0 ∧
g1 < -3. By the Monotonicity of Witnesses, this yields fewer data points; there are a
total of 15 data points passing through Node 5. TAR3 now suggests there is a smooth
relationship g1 = f_7(I1, I2).
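
The construction of a contrast set from the samples nearest a constraint boundary can be sketched as follows. This is an illustrative assumption about one way to do the selection, not the MATLAB scripts used in our process; `mark_nearest` and `Ranked` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { size_t index; double dist; } Ranked;

/* Orders samples by increasing distance to the boundary. */
static int by_dist(const void *p, const void *q)
{
    double a = ((const Ranked *)p)->dist, b = ((const Ranked *)q)->dist;
    return (a > b) - (a < b);
}

/* Sets near[k] = 1 for the given fraction of samples whose value lies
   closest to `boundary`, and 0 for the rest. Returns how many samples
   were marked, or 0 if memory could not be allocated. */
static size_t mark_nearest(const double *value, size_t n, double boundary,
                           double fraction, int *near)
{
    Ranked *r = malloc(n * sizeof *r);
    if (r == NULL) return 0;

    for (size_t k = 0; k < n; k++) {
        double d = value[k] - boundary;
        r[k].index = k;
        r[k].dist = (d < 0.0) ? -d : d;
        near[k] = 0;
    }
    qsort(r, n, sizeof *r, by_dist);

    size_t keep = (size_t)(fraction * (double)n + 0.5);  /* round to nearest */
    if (keep > n) keep = n;
    for (size_t k = 0; k < keep; k++)
        near[r[k].index] = 1;

    free(r);
    return keep;
}
```

Calling `mark_nearest(g1, n, -3.0, 0.2, near)` would flag the 20% of samples whose g1 value is closest to the boundary of the constraint g1 < -3; the flagged and unflagged samples then form the two classes given to the treatment learner.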

For Node 7 the exact solution g1 = I2 is predicted using function fitting
(Algorithm^), with an error less than 10^-15. For Node 4 the solution g2 = I1 + 3 is
predicted with an error less than 10^-14. These approximations, along with the previously
discovered system level constraints (Algorithm 1, Line 10), enable building new test
inputs for I1 and I2 to cover Nodes 4 and 7 on the next test iteration (Algorithm 1,
Lines 17-19).
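
For the one-variable linear case, function fitting reduces to ordinary least squares. The sketch below is a minimal illustration under that assumption; it is not the fitting procedure used in our implementation, which also handles several inputs and nonlinear terms, and the names `LineFit` and `fit_line` are hypothetical.

```c
#include <stddef.h>

typedef struct {
    int    ok;            /* 0 if the fit is undefined (too few or identical x) */
    double intercept;
    double slope;
    double max_residual;  /* largest |y - (intercept + slope * x)| over the samples */
} LineFit;

/* Least-squares fit of y = intercept + slope * x via the normal equations. */
static LineFit fit_line(const double *x, const double *y, size_t n)
{
    LineFit f = {0, 0.0, 0.0, 0.0};
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;

    for (size_t k = 0; k < n; k++) {
        sx  += x[k];
        sy  += y[k];
        sxx += x[k] * x[k];
        sxy += x[k] * y[k];
    }

    double denom = (double)n * sxx - sx * sx;  /* zero when all x are equal */
    if (n < 2 || denom == 0.0) return f;

    f.slope = ((double)n * sxy - sx * sy) / denom;
    f.intercept = (sy - f.slope * sx) / (double)n;

    for (size_t k = 0; k < n; k++) {
        double r = y[k] - (f.intercept + f.slope * x[k]);
        if (r < 0.0) r = -r;
        if (r > f.max_residual) f.max_residual = r;
    }
    f.ok = 1;
    return f;
}
```

Given monitored pairs (I1, g2) that satisfy g2 = I1 + 3, the fit returns a slope of 1 and an intercept of 3 with a residual at the level of floating-point rounding, which is the signal that the relationship has been recovered exactly rather than approximated.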




Program 2. The System Function in the Prototype Quadratic Example. The Unit Function
is the same as in Program 1 except that the Unit Function for this case expects
inputs of type double.

double g1=1.0, g2=2.0;

int System (double I1, double I2)
{
  if (I1 > 0) g1 = I2;
  else g1 = -I2;
  g2 = I1*I2+3.0*I1*I1+I2*I2;
  Unit (I2, I1);
}


A Piecewise Quadratic Case Study. As a simple example of how our technique could
be used in the presence of nonlinear constraints (that are not typically handled by
off-the-shelf solvers), we propose the example in Program 2. Program 2 and Program 1
differ in the use of doubles instead of ints and the nonlinear assignment formula for
g2 before Unit is called. T is identical to the one given in Figure 2 and Figure 3. A
breadth-first search over covered nodes gives identical results to the previous section.

TAR3’s results for Node 4 are shown in Figure 7. The treatment was unable to bound
all of the contiguous boxed data; this suggests that f_4 is smooth but nonlinear.
Fig. 6. Node 7’s treatment


Fig. 7. Node 4’s treatment 


Function fitting is applied for Nodes 4 and 7. Node 7’s results are identical to those
in the previous section. For Node 4, function fitting gives a residual error of less than
10^-15 and the exact solution g2 = 3.0 * I1^2 + I2^2 + I1 * I2. Our algorithm first
attempts to create an I that satisfies g2 < 0 and is consistent with the system parameters
and ranges learned previously (Line 10 of Algorithm 1), but discovers that there is
an inconsistency. There are no real roots that satisfy the constraint for g2 given f_4
and the range constraints for Node 4’s parent (Node 2). Function fitting for Node 2
yields the exact result i1 = I2. By simple substitution the correct system constraint is
I2 > 0. An examination of Node 4’s constraint reveals that the two system constraints
are unsatisfiable; no system test leads us to Node 4.
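
The inconsistency can also be checked by hand. Assuming the fitted expression g2 = 3.0 * I1^2 + I2^2 + I1 * I2 is exact, completing the square gives:

```latex
\begin{aligned}
g_2 &= 3 I_1^2 + I_1 I_2 + I_2^2 \\
    &= 3\left(I_1 + \tfrac{1}{6} I_2\right)^2 + \tfrac{11}{12}\, I_2^2 \;\ge\; 0 .
\end{aligned}
```

Both terms are squares with positive coefficients, so g2 < 0 has no real solution for any I1 and I2, and in particular none with I2 > 0.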

An Aerodynamics Case Study. In this aerodynamics case study the code predicts the
drag coefficient Cd, as calculated by the USAF Stability and Control DATCOM manual
(26); it can be found at https://c3.nasa.gov/dashlink/projects/57/#c0. Cd is used in
the yaw control law for a supersonic aircraft designed to fly
between 30,000 and 80,000 feet at Mach numbers M between 0.8 and 3.0. M is a ratio
of the plane’s airspeed to the speed of sound, and is calculated by measuring two
different pressures, P_t and P_s. The system I consists of three arguments from sensors:
P_t, P_s, and the altitude Alt. This sensed data is used to calculate M, compressible and
incompressible skin friction coefficients Cf and Cfb, and the corresponding terminal
skin friction coefficients CfT and CfbT. For subsonic (M < 1) compressible flow in
air, M is given by Equation 1; for supersonic (M >= 1) flow, M is found implicitly
using the Rayleigh Pitot tube formula (27), shown here as Equation 2.


\[ M = \sqrt{5\left[\left(\frac{P_t}{P_s}\right)^{0.4/1.4} - 1\right]} \qquad (1) \]

\[ \frac{P_t}{P_s} = \left(\frac{5.76\,M^2}{5.6\,M^2 - 0.8}\right)^{3.5} \frac{2.8\,M^2 - 0.4}{2.4} \qquad (2) \]

For Equation 2 there is no explicit formula for M given P_t, P_s. One code component
uses Newton’s Method to solve Equation 2 and is used as a black box for our technique.
Cf, Cfb, CfT and CfbT are complicated nonlinear functions of M and Alt (26). The
unit calculates Cd based on the skin friction and the base drag. The relationships
between Cd and the unit inputs are nonlinear, but the constraints defining the relationships
are linear and easy to both discover and solve using concolic execution techniques.
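
For readers unfamiliar with the implicit solve, the following is a minimal sketch of a Newton iteration for the supersonic case. It is not the black-box component used in the case study. It assumes the Rayleigh Pitot tube formula for air (ratio of specific heats 1.4) in the form P_t/P_s = (5.76 M^2 / (5.6 M^2 - 0.8))^3.5 * (2.8 M^2 - 0.4) / 2.4, and it solves the squared equation so that only integer powers are needed. That rearrangement, the analytic derivative and the function names are our own illustrative choices.

```c
/* Square of the right-hand side of the Rayleigh Pitot formula, i.e.
   (P_t/P_s)^2 as a function of M. Squaring turns the exponent 3.5 into 7. */
static double rayleigh_ratio_sq(double M)
{
    double m2 = M * M;
    double a = 5.76 * m2 / (5.6 * m2 - 0.8);
    double b = (2.8 * m2 - 0.4) / 2.4;
    double a2 = a * a, a4 = a2 * a2;
    return a4 * a2 * a * b * b;  /* a^7 * b^2 */
}

/* Derivative of rayleigh_ratio_sq with respect to M, obtained from the
   logarithmic derivative R'/R = 7 a'/a + 2 b'/b. */
static double rayleigh_ratio_sq_deriv(double M)
{
    double m2 = M * M;
    double dlog_a = 2.0 / M - 11.2 * M / (5.6 * m2 - 0.8);
    double dlog_b = 5.6 * M / (2.8 * m2 - 0.4);
    return rayleigh_ratio_sq(M) * (7.0 * dlog_a + 2.0 * dlog_b);
}

/* Newton iteration for M >= 1 given the two measured pressures.
   Returns -1.0 if the inputs are invalid, if the pressure ratio is below
   the sonic value (the subsonic formula applies instead), or if the
   iteration does not converge. */
static double mach_from_pitot(double pt, double ps)
{
    if (pt <= 0.0 || ps <= 0.0) return -1.0;

    double target = (pt / ps) * (pt / ps);
    if (target < rayleigh_ratio_sq(1.0)) return -1.0;

    double M = 1.0;  /* start at the sonic point */
    for (int it = 0; it < 100; it++) {
        double step = (rayleigh_ratio_sq(M) - target) / rayleigh_ratio_sq_deriv(M);
        M -= step;
        if (M < 1.0) M = 1.0;  /* stay inside the supersonic branch */
        if (step < 1e-12 && step > -1e-12) return M;
    }
    return -1.0;
}
```

As a sanity check, a pressure ratio of about 5.6404 corresponds to M close to 2 in standard normal-shock tables, and the iteration reproduces that value. Because the solve is iterative and involves a fractional power in its original form, it is a natural candidate to treat as a black box rather than to execute symbolically.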


[Parameters]
2 CfbT
3 Cf
4 M
5 CfT
6 Cfb
[Tree]
8 (Cf > CfT) (C)
9 (M >= (780000 / 1000000)) (C)
10 (M > (1040000 / 1000000)) (C)
11 (M >= (600000 / 1000000)) (C)
12 (Cfb > CfbT) (C)
13 (M >= 1) (C)
14 (M <= (2000000 / 1000000)) (C)
15 (M > (2000000 / 1000000)) (C)
16 (M < 1) (S)
17 (Cfb <= CfbT) (S)
18 (M < (600000 / 1000000)) (S)
19 (M <= (1040000 / 1000000)) (S)
20 (M < (780000 / 1000000)) (S)
21 (Cf <= CfT) (S)


Fig. 8. The constraints tree after seven rounds of initial testing 


We begin our testing of the system by looking at nominal ranges for the aircraft: Alt
between 30 and 80 thousand feet, P_t between 0.0145 and 25, and P_s between 0.00971
and 3.5. Performing 2-factor combinatorial testing (28) with 5 bins for each of these
parameters gives 9 initial test cases. Two of these cases have P_t < P_s, a physical
impossibility, and are thrown out.

The constraints tree T for our 7 initial test cases covers only 2 paths through the
tree, as shown in Figure 8. T is traversed using a breadth-first search. For the nodes
at lines 21 and 17 of Figure 8, TAR3 suggests a smooth relationship between the unit
parameters and the system parameters P_s and Alt. For the nodes at lines 16 and 18-20,
TAR3 suggests a smooth relationship between M and the system parameters P_t and
P_s. Function fitting is performed for the nodes not covered by system testing, using all
7 initial data points, giving the approximation M = 5.7022 + 0.0035 * P_t^2 - 0.0092 *
P_s * P_t + 0.7255 * P_s^2 - 0.0124 * P_t - 3.4665 * P_s with a residual of 0.0479. This
process is repeated to find approximations between the unit parameters Cf, Cfb, CfT
and CfbT, and the system parameters P_s and Alt that were implicated by TAR3.

Constraint solving is then used to find test inputs for each node not covered in T.
The result is 17 new I, which are used for new simulations. Concolic execution records
the paths taken through the unit; the resulting T has 5 covered paths with 21 covered
nodes and 12 nodes not covered, of which only 5 are satisfiable. When
the new T is compared against the one in Figure 8, the constraints at lines 17, 19 and
21 are covered. After two rounds of testing, our method uses 24 tests to illuminate a
constraints tree with 21 covered nodes and 12 nodes not covered.

We compared our technique against state-of-the-art black box testing by generating
a test suite with 25 n-factor combinatorial tests; n-factor combinatorial testing typically
obtains better coverage than random Monte Carlo testing (22,29). With a comparable
number of tests (24 vs. 25) our technique achieves significantly higher coverage (21
covered nodes) than the coverage obtained by n-factor combinatorial testing alone (16
covered nodes).

6 Conclusion 

We described a testing technique that combines the strengths of black-box system
simulation with white-box unit symbolic execution to overcome their weaknesses. The
technique uses machine learning, function fitting and constraint solving to iteratively guide
the generation of system-level inputs and increase testing coverage. We showed in
the experience section that we could use our tool to increase coverage of a unit using 
fewer test cases compared to state-of-the-art combinatorial testing. System level simu- 
lation can be expensive, and using information from white-box techniques allowed us 
to significantly decrease the time cost. White-box techniques, like concolic execution, 
may not scale to a full system. This is especially true when the system either contains 
non-linear components or contains components for which the source code is unavail- 
able. Covering each white-box unit separately is an option, but there are likely to be 
test cases which are not possible given the constraints of the full system. As an
example, the values of the Mach number and the friction coefficients in our aerodynamics
case study are constrained by the measured values of the pressures P_t and P_s and the
altitude. This means that, even though the Mach number and the friction coefficients
are treated as independent inputs to our unit, the values of these variables cannot truly 
vary independently. If we performed only unit-level full coverage, we may miss dead 
code that is unreachable given the system, or we may spend too much time exploring 
behaviors in the unit that are not possible given the unit’s true calling context. In the 
future, we plan to study alternative approaches to machine learning (e.g. Daikon) and 
to perform a thorough evaluation of the technique to determine its utility in practice. 